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Abstract 

The number of available algorithms to infer a biological network from a dataset of high-throughput measurements is 
overwhelming and keeps growing. However, evaluating their performance is unfeasible unless a 'gold standard' is available 
to measure how close the reconstructed network is to the ground truth. One measure of this is the stability of these 
predictions to data resampling approaches. We introduce NetSI, a family of Network Stability Indicators, to assess 
quantitatively the stability of a reconstructed network in terms of inference variability due to data subsampling. In order to 
evaluate network stability, the main NetSI methods use a global/local network metric in combination with a resampling 
(bootstrap or cross-validation) procedure. In addition, we provide two normalized variability scores over data resampling to 
measure edge weight stability and node degree stability, and then introduce a stability ranking for edges and nodes. A 
complete implementation of the NetSI indicators, including the Hamming-lpsen-Mikhailov (HIM) network distance adopted 
in this paper is available with the R package nettools. We demonstrate the use of the NetSI family by measuring network 
stability on four datasets against alternative network reconstruction methods. First, the effect of sample size on stability of 
inferred networks is studied in a gold standard framework on yeast-like data from the Gene Net Weaver simulator. We also 
consider the impact of varying modularity on a set of structurally different networks (50 nodes, from 2 to 10 modules), and 
then of complex feature covariance structure, showing the different behaviours of standard reconstruction methods based 
on Pearson correlation, Maximum Information Coefficient (MIC) and False Discovery Rate (FDR) strategy. Finally, we 
demonstrate a strong combined effect of different reconstruction methods and phenotype subgroups on a hepatocellular 
carcinoma miRNA microarray dataset (240 subjects), and we validate the analysis on a second dataset (166 subjects) with 
good reproducibility. 
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Introduction 

The problem of inferring a biological network structure given a 
set of high-throughput measurements, e.g. gene expression arrays, 
has been addressed by a large number of different methods 
published in the last fifteen years (see [1,2] for two recent 
comparative reviews). Solutions range from general purpose 
algorithms (such as correlation [3] or relevance networks [4]) to 
methods tailored ad hoc for specific data types. Recent examples 
include SeqSpider [5] for Next Generation Sequencing data, or 
Sparsity-aware Maximum Likelihood [6] for cis-expression quan- 
titative trait loci (cis-eQTL). 

However, network reconstruction is an underdetermined 
problem, since the number of interactions is significantly larger 
than the number of independent measurements [7]. Thus, all 
algorithms must aim to find a compromise between reconstruction 
accuracy and feasibility: simplifications inevitably detract from the 
precision of the final outcome by including a relevant number of 
false positive links [8], which should be discarded e.g., by 
identifying and removing unwanted indirect relations [9]. More- 
over, inference accuracy is strongly dependent on the assumptions 
used to choose the best hypothetical model of experimental 
observations [10]. 



These issues make the inference problem "a daunting task" [1 1] 
not only in terms of devising an effective algorithm, but also in 
terms of quantitatively interpreting the results obtained. In 
general, reconstruction accuracy is far from optimal in many 
situations and several pitfalls may occur [12], related to both the 
methods and the data [13]. In extreme cases, many link 
predictions are statistically equivalent to random guesses [14]. In 
particular, it is now widely acknowledged that the size and quality 
of the data play a critical role in the inference process [15-18]. All 
these considerations support the opinion that network reconstruc- 
tion should still be regarded as an unsolved problem [19]. 

Given the growing list of available algorithms, efforts have been 
made to develop methods for the objective comparison of network 
inference methods including the identification of current limita- 
tions [20,21] and their relative strengths and disadvantages [7,22]. 
The most systematic effort is probably the international DREAM 
challenge [23]: from DREAM 2012 emerged a consensus 
advocating the integration of predictions from multiple inference 
methods as an effective strategy to enhance performance [24]. 
However, algorithm uncertainty has so far been assessed only in 
terms of performance, i.e., the distance of the reconstructing 
network from the ground truth, whenever available, while the 
stability of the methods has been neglected. When no gold 
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Figure 1 . HIM distance: contribution of H and IM. (A) An example on three 5-node networks mutually differing by two links. (B) An example on 
network G4, as defined in Subsection Stability is modularity invariant. G m /: network G4 without the four red links. G gnvn : network G4 without green 
links. Gblue: network G4 without blue links. (C) The mutual differences between the pairs of networks in (A), (Na,Nb) and (Na,Nc)- (D) (Gn,G m i), 
(G4,G gm ,„), (Gt,Ghi m ). In both cases they have the same Hamming distance but different spectral structure, thus resulting in different Ipsen-Mikhailov 
distances. 

doi:10.1371/journal.pone.0089815.g001 



standard is available for a given problem, there is no chance to 
evaluate algorithm accuracy. In such cases we can consider 
stability as a rule of thumb for judging the reliability of the 
resulting network. Obviously, the performance of a network 
reconstruction algorithm and the stability/reliability of the 
resulting network inferred from a specific dataset are two distinct 
and equally crucial aspects of the network inference process. The 



best way to optimize both aspects would be to adopt only network 
reconstruction algorithms with well characterized performance, 
i.e., evaluated in cases where the ground truth is known, and with 
stability always checked on specific data. It is also worthwhile 
noting that the evaluation of inference stability is not related to the 
(chemical or physical) "stability" of the represented process. 
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Figure 2. An example of HIM distance. Representation of the HIM distance in the Ipsen-Mikhailov (IM axis) and Hamming (H axis) distance space 
between networks A versus B, E and F, where E is the empty network and F is the fully connected one. 
doi:10.1371/journal.pone.0089815.g002 
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Given a dataset D with s samples and p features, reconstruct (with a chosen algorithm A) the network 
N D on the whole dataset D; denote the p nodes of N D by xf, ...,x% and its edges' weight by a% k , for 
k, h = 1, . . . ,p. 

Choose two integers n, r with n < s and r < (*), and build a set T> { „ T) = {D u . . . D r } where D f is a 
dataset built choosing » samples from D. 

Reconstruct, by using the same algorithm A, the corresponding networks A' D , on the subsampled data. 
Compute the NetSI family, where 5 is an appropriate network distance (e.g., S = HIM): 



Stability: S s (n,r) = meanCHi) 



Internal Stability: S^(n,r) = mean(% 2 ) 



Edge weight stability: S w (n, r, h, k) ■■ 



range(£") 



mean(i?) 



where E = {afa : 



l,...,r} and 



Node degree stability: S d (n, r, h) ■■ 



range(A) 



mcan(A) 



where = {6{N D , No,): i = 1 r] 

where H 2 = {<5( N Dl , N Dj ) : i, j = 1, . . . r, i ? j} 

for k,h = l,...,p 

range(-E) = | max(£) — min(£)| 

for h = 1,. . .,p 



where A = {d(x®') : i = 1, . . . , r} and range(A) = | max(A) - min(A)| 
d is the degree function (i.e., number of edges connecting the node x% 4 ). 

• For each set of values of the NetSI family compute the 95% studentized bootstrap confidence intervals 

• Compare the statistics of the NetSI family to describe the level of confidence (stability) in the network 
N D , in its links and in its nodes. 

Figure 3. Definition of the NetSI family. 

doi:1 0.1 371 /journal.pone.008981 5.g003 



We propose to tackle the stability issue by quantifying inference 
variability with respect to data perturbation, and, in particular, 
data subsampling. If a portion of data is randomly removed before 
inferring the network, the resulting graph is likely to be different 
from the one reconstructed from the whole dataset and, in general, 
different subsets of data would give rise to different networks. 
Thus, in the spirit of applying reproducibility principles to this 



field, one has to accept the compromise that the inferred/non 
inferred links are just an estimation, lying within a reasonable 
probability interval. Here we introduce the Network Stability 
Indicators (NetSI) family, a set of four indicators allowing the 
researcher to quantitatively evaluate the reproducibility of the 
reconstruction process. We propose to quantitatively assess, for a 
given ratio of removed data and for a given amount of (bootstrap 
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Figure 4. Graphical description of the pipeline in Fig. 3. Using the inference algorithm A, the network Nd is first reconstructed from the whole 
dataset D with s samples and /; features (nodes). Given two integers n,r, a set of r datasets D, is generated by choosing for each /' a subset of n 
samples from D, and the corresponding networks No, are inferred by A. Finally, the four indicators Ss(n,r), S' s (n,r), S w (n,r,h,k) and S,/(n,r,h) are 
computed according to their definition. 
doi:1 0.1 371 /journal.pone.008981 5.g004 
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Figure 5. Synthetic network with m modules, where m ranges from 2 to 10 from top left to bottom right. 

doi:10.1371/journal.pone.0089815.g005 



[25] or cross-validation) resampling, the mutual distances among 
all inferred networks and their distances to the network generated 
by the whole dataset, with the idea that, the smaller the average 
distance, the more stable the inferred network. Similarly, we 
propose two indicators for the distribution of variability of the link 
weight and node degree across the generated networks, providing 
a ranked list of the most stable links and nodes, the least variable 



being the top ranked. The described framework for evaluating the 
stability of the whole network obviously relies on a network 
distance, but it is independent from the chosen metric. As network 
distance we use the Hamming-Ipsen-Mikhailov (HIM) distance 
[26], or its components for demonstration purposes, because it 
represents a good compromise between local (link-based) 
and global (structure-based) measures of network comparison. 
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Table 1. Modularity and density values for 50-nodes 
networks (G,) for an increasing number of modules m. 
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Table 2. G„, networks: range of S$ for different 
reconstruction algorithms and r> = {HIM,H,IM}. 
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Figure 6. G,„ networks: Stability of synthetic networks for different modularity levels. 

doi:1 0.1 371 /journal.pone.008981 5.g006 
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Figure 7. G„, networks: distance between gold standard (HIM) and inferred synthetic networks for different modularity levels. 

doi:1 0.1 371 /journal. pone.0089815.g007 



Moreover, the HIM distance can be easily included in pipelines 
for network analysis [27]. 

We first show the effect of network modularity and the dataset 
sample size on both the stability and the accuracy of the network 
inference process. For this purpose, we create two synthetic 
datasets with a known gold standard. The results are demonstrated 
for several inference algorithm, such as the Algorithm for the 
Reconstruction of Accurate Cellular Networks (ARACNE), 
developed for the reconstruction of gene regulatory networks 
[28], the Context Likelihood of Relatedness (CLR) approach [29] 
and the Weighted Gene Correlation Network Analysis (WGCNA) 
[30]. Then the NetSI indicators are computed on correlation 
networks developed on another ad hoc synthetic dataset. We 
highlight the difference in terms of stability due to the choice of the 
inference algorithm: two basic correlation measures and the 
impact of a permutation-based False Discovery Rate (FDR) filter. 
Finally, we show the use of NetSI measures in a typical 
application, comparing the stability of relevance networks inferred 
on a miRNA microarray dataset with paired tissues extracted from 



a cohort of 241 hepatocellular carcinoma patients [31,32]. The 
data exhibit two phenotypes, one related to disease (tumoral or 
non-tumoral tissues) and one to patient gender (male or female); 
we show that four different networks are obtained, each of 
different stability, and that the reconstruction method is a serious 
source of variability with the smaller data subgroups. Finally we 
validate the analysis on a second hepatacellular carcinoma dataset 
(166 subjects) with good reproducibility. 

All the methods (HIM distance and NetSI indicators) have been 
implemented in the open source R package nettools for the CRAN 
archives, as well as on GitHub at the address https://github.com/ 
MPBA/ nettools. git. For computing efficiency, the software can be 
used on multicore workstations and on high performance 
computing (HPC) clusters. Further technical details and prelim- 
inary experiments with nettools are available in [33]. 

Methods 

Before defining the NetSI family we briefly summarize the 
main definitions and properties of the HIM network distance. 
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Figure 8. Yeast-like simulated data: effect of increasing sample size on network reconstruction stability .Shim- 
inference algorithms are compared. 
doi:1 0.1 371 /journal.pone.008981 5.g008 



Different network 



Moreover, at the end of this section, we provide a short description 
of the network inference approaches used in the following 
experiments. 

HIM network distance 

The HIM distance [26] is a metric for network comparison 
combining an edit distance (Hamming [34,35]) and a spectral one 
(Ipsen-Mikhailov [36]). As discussed in [37], edit distances are 
local, i.e. they focus only on the portions of the network interested 
by the differences in the presence/absence of matching links. 
Spectral distances evaluate instead the global structure of the 
compared topologies, but they cannot distinguish isomorphic or 
isospectral graphs, which can correspond to quite different 
conditions within the biological context. Their combination into 
the HIM distance represents an effective solution to the 
quantitative evaluation of network differences. 



Let N\ and N2 be two simple networks on N nodes, described 
by the corresponding adjacency matrices 

A m 

and with 

Oy\a^j eF, where F = p2 = {0,l} for unweighted graphs and 
F = [0,1] for weighted networks. Denote then by the identity 
/l 0 ... 0\ 
0 1 ■■■ 0 



N x N matrix l N = 



, by ¥ N the unitary NxN 



\0 0 .. 1/ 

matrix with all entries equal to one and by the null NxN 
matrix with all entries equal to zero. Finally, denote by £n the 
empty network with N nodes and no links (with adjacency matrix 
•Kjv) and by Tfi the undirected full network with N nodes and all 
possible N(N — 1) links (whose adjacency matrix is Fai— I/v). 
The definition of the Hamming distance is the following: 
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Figure 9. Yeast-like simulated data: effect of increasing sample size on network reconstruction internal stability S' mM . Different 

network inference algorithms are compared. 

doi:10.1371/journal.pone.0089815.g009 



Hamming^!, A/" 2 )= £ \^f~4 



(2), 



\<i^j<N 



To guarantee independence from the network dimension 
(number of nodes), we normalize the above function by the factor 
rj = Hamming(£fj,J r n) = N{N — 1): 



H(MiMi)-- 



E 



(2) 



When M\ and Mi are unweighted networks, H(M\ Ml) is just 
the fraction of different matching links (over the total number 



N(N — 1) of possible links) between the two graphs. In all cases, 
H(MiM2)s[0,l]> where the lower bound 0 is attained only for 
identical networks A^ ' = A^ and the upper bound 1 is reached 
whenever the two networks are complementary 



A^+A^^^-Jj,- 



/o 
1 

V 1 



networks 
1 ■•■ 
0 ••• 



1\ 



1 



1 ■•• 0/ 

Among spectral distances, we consider the Ipsen-Mikhailov 
distance IM which has been proven to be the most robust in a wide 
range of situations [37,38]. Originally introduced in [36] as a tool 
for network reconstruction from its Laplacian spectrum, the 
definition of the Ipsen-Mikhailov metric follows the dynamical 
interpretation of a N-nodes network as a N-atoms molecule 
connected by identical elastic strings, where the pattern of 
connections is defined by the adjacency matrix of the correspond- 
ing network. In particular the connections between nodes in the 
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Figure 10. Yeast-like simulated data: effect of increasing sample size on network reconstruction accuracy measured as HIM 
distance and its components Hamming (H) and Ipsen-Mikhailov (IM) with respect to the gold standard. Different network inference 
algorithms are compared. 
doi:10.1371/journal.pone.0089815.g010 



network correspond to the bonds between atoms in the dynamical 
system and the adjacency matrix is its topological description. 

We summarize here the mathematical details of the IM 
definition [36]. The dynamical properties of the oscillatory system 
are described by the set of N differential equations 

N 

x; + Aij(xj - xj) = 0 for i = 0, • • • ,N- 1 , (2) 

7 = 1 

where x, are the coordinates of the physical molecules. Since the 
adjacency matrix A depends on the node labeling, we consider 
instead the Laplacian matrix L, which for an undirected network 
is defined as the difference between the degree matrix D (the 
diagonal matrix with vertex degrees as entries) and A: L = D — A. 
L is positive semidefmite and singular [39-42], and its set of 
eigenvalues 0 = Ao <k\ < ■ ■ ■ < Xn-i, t.e. the spectra of the 
associated graph, provide the natural vibrational frequencies CO; 
for the system modeled in Eq. 2: A,- = <»?, with /.o = fUo=0. The 
spectral density p for a graph can be written as the sum of Lorentz 

1 . Let D be a dataset with m samples described by p features, and let C(h, k) = |cor(x/,, x k )\ 
where xj is the j-th feature of D across the m samples and cor is a correlation measure. 

2. Build the standard correlation network N D using the rule a hk = C(h, k) 

3. Build the FDR controlled (at)>value p = 10~ z ) correlation network M|J using the rule 

(C(h,k) if \F?,(h,k)\ < 1 

(Ihk — \ 

I 0 otherwise, 
where the set F^(h, k) is defined as follows 

F%(h, k) = {\cor(ai(xh),n(a>k))\ > C(h, k) : <r«, n e S m ,i = 1 max{10*,m! - 1}} , 

for distinct non trivial aiT^ 1 , where 5„, is the permutation group on m elements. 

Figure 1 1 . Construction of an FDR-corrected correlation network. 

doi:10.1371/journal.pone.0089815.g011 



distributions 



p(o),y) = Ky] ^2 j > 

f^(co-co,) 2 +y 2 



where y is the common width and K is the normalization 
constant defined as 



K- 



(•«> dm 
Aj J° (o)-(o,) 2 +y 2 



so that 



p(a>,y)&a>= 1. The scale parameter y specifies the 



half-width at half-maximum, which is equal to half the inter- 
quartile range. From the above definitions, the spectral distance y 
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Figure 12. The correlation matrix M T used to generate the 
synthetic dataset T. 

doi:1 0.1 371 /journal.pone.008981 5.g01 2 

between two graphs G and H with densities Pg(a>,y) and p H (oj,y) 
can then be defined as 



e y (G,H) = 



The highest value of ty is reached, for a given number of nodes 
N, when evaluating the distance between £jv and J- ft- Defining y 
as the (unique [26]) solution of 



we can now define the normalized Ipsen-Mikahilov distance IM 



\M(G,H)=ey{G,H) = J J°° [p G (co,y) - p„(^fdw , 



so that IM(G,//)e[0,l] with the upper bound attained only for 
(G,H) = (£n,Tn)- 

Finally, the generalized HIM distance is defined by the one- 
parameter family of product metrics linearly combining with a 
factor £e[0, + oo) the normalized Hamming distance H and the 
normalized Ipsen-Mikhailov IM distance, further normalized by 

the factor + £ to set its upper bound to 1 : 



HIM^j ,N 2 ) = -=L= JuiNuNif + HMiNuNi) 2 



generate a kernel function for classification tasks (e.g. on brain 
networks). 

In what follows we will mostiy deal with the case £, = 1 , and omit 
the subscript ^ for brevity. The relative effect of the two 
components is exemplified in Fig. 1A-D. The three small size 
networks (5 nodes) N^,Ng, and Nc in Fig. 1A differ from each 
other in only two edges but and Nc are isomorphic and 
diverse from Ng, as correctly picked up by the HIM distance (see 
table in Fig. 1C). Similarly, HIM, H and IM provide different 
values when four edges are cut from on the larger (50 nodes) G4 
network, at different levels of the graph structure. Larger effects 
are caused by the elimination of the four red edges connecting the 
four submodules with differences up to 10 times larger for IM with 
respect to H (see table in Fig. ID). 

The HIM distance can be represented in the [0,1] x [0,1] 
Hamming/Ipsen-Mikhailov space, where a point P(x,y) repre- 
sents the distance between two networks N\ and N2 whose 
coordinates are x = H(iVi JVi) and y = IM(iVi ,N 2 ) and the norm 
of P is \/T+£ times the HIM distance HIM(7Vj ,N 2 ). The same 
holds for weighted networks, provided that the weights range over 
[0,1]. In Fig. 2 we provide an example of this representation by 
evaluating the HIM distance between networks of four nodes, 
namely networks A, B, E (empty) and F (full) in the left panel of 
Fig. 2. If the Hamming/Ipsen-Mikhailov space is roughly split into 
four quadrants I, II, III, and IV, then two networks whose distance 
is mapped in quadrant I are close both in terms of matching links 
and of structure, while those falling in quadrant III differ with 
respect to both characteristics. Networks corresponding to a point 
in quadrant II have many common links, but different structures, 
while a point in quadrant IV indicates two networks with few 
common links, but with similar structure. 

Full mathematical details about the HIM distance and its two 
components H and IM can be found in [26]. 

The Network Stability Indicators (NetSI) 

The mathematical and operational definition of the four NetSI 
indicators are introduced in Fig. 3. The first two are the stability 
indicators Sg and the internal stability indicator S r s , which concern 
the stability of the whole reconstructed network. The former 
measures the distances between the network inferred on the whole 
dataset against the networks inferred from the resampled subsets. 
The latter measures all the mutual distances within the networks 
inferred from the resampled subsets. The other two indicators, the 
edge weight stability indicator S w and the node degree stability 
indicator Sj, concern instead the stability of the single links and 
nodes, in terms of mutual variability of their respective weight and 
degree. In all cases, smaller indicator values correspond to more 
stable objects. 

We adopt (5 = HIM, except for the first experiment where we 
show also the stability for S = H and S = IM. As the HIM distance 
is defined also on directed networks, the extension of the NetSI 
family to the directed case is straightforward. A graphical 
representation of the procedure is provided in Fig. 4. For all 
experiments reported in this paper, we used n = s — 1, r=l (leave- 
one-out stability, LOO for short), and 20 different instances of k- 
fold cross validation (discarding the test portion) for A: = 2,4, 10 (k2, 
s(k-l) 



k4 and £10), and thus n- 



and r = 20k. 



Obviously, HIM 0 = H and lim HIM C - = IM. For example, 

the flexibility introduced by £ can be used to focus attention more 
on structure than on local editing changes when HIMf is used to 



Stability of network inference algorithms 

As a first application, we test the difference in stability of the 
reconstruction process for a set of alternative network co- 
expression inference algorithms. 
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1C 



1-7 



re 




Figure 13. Synthetic dataset T: correlation networks inferred by using (A) WGCNA [W], (B) (absolute) Pearson with FDR correction 

at /i-value 10~ 4 [C(10~ 4 )] and (C) MIC [M]. Node label i corresponds to feature j), node size is proportional to node degree and link colors identify 

different classes of link weights. 

doi:10.1371/journal.pone.0089815.g013 



The most famous representative of the correlation-based 
approaches is surely the Weighted Gene Correlation Network 
Analysis (WGCNA) [30,43]. In this case the co-expression 
similarity is defined as a function of the absolute correlation. We 
adopt as similarity score: (i) the simple absolute Pearson 
correlation (labelled as "cor"), (ii) a more sophisticated version 
with soft-thresholding, i.e., the similarity is defined as a power of 
the absolute correlation (we adopt the default value six as in the 
WGCNA R package), or (iii) the biweight midcorrelation ("bicor" 
for short) [30,44], which is more robust to outliers than the 
Pearson correlation, and (iv) the Maximal Information Coefficient 
(labeled as MIC). MIC is a recent association measure based on 
mutual information and belongs to the Maximal Information- 
based Nonparametric Exploration (MINE) statistics [44—48] . In all 
cases we obtain a weighted network with link strength ranging 
from 0 to 1. 

The Topological Overlap Measure (TOM) replaces the original 
co-expression similarity with a measure of interconnectedness 



(between pairs of nodes) based on shared neighbors [30,43]. TOM 
can be seen as a filter for cutting away weak connections, thus 
leading to more robust networks than WGCNA. 

The Context Likelihood of Relatedness (CLR) approach [29] 
scores the interactions by using the mutual information between 
the corresponding gene expression levels, coupled with an adaptive 
background correction step. Although suboptimal if the number of 
nodes is much larger than the number of variables, it was observed 
that CLR performs well in terms of prediction accuracy and some 
CLR predictions in literature were recently validated experimen- 
tally [49]. 

The Algorithm for the Reconstruction of Accurate Cellular 
Networks (ARACNE) is another approach relying on mutual 
information, which was originally developed for inferring regula- 
tory networks of mammalian cells [28]. It starts with a graph 
where each pair of nodes are connected if their association is above 
a chosen threshold. In order to avoid the false positive problem, 
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Table 3. Synthetic dataset T: stability (Shim and s myd on networks inferred by MIC, WGCNA, and CORFDR(p) 



A 


k 


Shim 


CI (min; max) 


Range (min; max) 


J HIM 


CI (min; max) 


Range (min; max) 


MIC 


LOO 


0.008 


(0.007; 0.008) 


(0.004; 0.011) 


0.008 


(0.008; 0.008) 


(0.003; 0.014) 


MIC 


A10 


0.052 


(0.051; 0.052) 


(0.041; 0.067) 


0.021 


(0.021; 0.021) 


(0.014; 0.036) 


MIC 


A4 


0.055 


(0.054; 0.057) 


(0.040; 0.071) 


0.031 


(0.031; 0.031) 


(0.022; 0.045) 


MIC 


kl 


0.139 


(0.134; 0.142) 


(0.112; 0.158) 


0.047 


(0.047; 0.048) 


(0.035; 0.067) 


WGCNA 


LOO 


0.005 


(0.005; 0.006) 


(0.001; 0.015) 


0.008 


(0.008; 0.008) 


(0.002; 0.023) 


WGCNA 


A-10 


0.021 


(0.020; 0.022) 


(0.011; 0.040) 


0.028 


(0.028; 0.028) 


(0.012; 0.064) 


WGCNA 


A4 


0.039 


(0.037; 0.041) 


(0.020; 0.062) 


0.046 


(0.046; 0.047) 


(0.025; 0.088) 


WGCNA 


kl 


0.070 


(0.065; 0.076) 


(0.037; 0.108) 


0.070 


(0.069; 0.071) 


(0.042; 0.117) 


CORFDR(10 -2 ) 


LOO 


0.01 5 


(U.U 1 J3, U.U 1 O) 


lr\ (vie. n no c\ 
(U.UUD, U.UoDJ 


0.01 7 


(U.U 1 /, U.U 1 /) 


m urn . r\ r\A ~7\ 
(U.UU I , U.U4/ J 


CORFDR(10~ 2 ) 


A 1U 


0.023 


(n m->. n 
tu.u^z, U.UZDj 


(n nn"7- n cyia\ 
(U.UU/, U.U/4J 


0.028 


(n m7- n mo\ 
\\J.KJZ/, U.UZO) 


in nm- n 1 mi 

tU.UUz, U. I UZ) 


CORFDR(10~ 2 ) 


/C4 


0.031 


m mo. n 
(U.U^tS, U.UJ4J 


tu.u i u, u.uoyj 


0.034 


In no A. n no c\ 
IU.UJ4, U.UjSDJ 


tu.uuo, u.uyoj 


CORFDR(10~ 2 ) 


k _ 


0.045 


(U.Ujy, U.U54J 


(n m a . n 1 m\ 
(U.U 1 4, U. I U/J 


0.050 


In nAO. n nc i \ 
(U.U4b, U.UD 1 j 


m c\c\£L. r\ 1 c~t\ 
(U.UUo, U. I dZ) 


CORFDR(510~ 3 ) 


LOO 


0.029 


m mo. n no 1 \ 
(U.U^tS, U.Uo I ) 


(U.UUjS, U.U40J 


0.01 8 


IU.U 1 o, U.U 1 a) 


tU.UUU, U.UD4J 


CORFDR(510" 3 ) 


A-10 


0.025 


(0.024; 0.027) 


(0.004; 0.054) 


0.024 


(0.024; 0.024) 


(0.001; 0.083) 


CORFDR(510" 3 ) 


A4 


0.025 


(0.023; 0.028) 


(0.006; 0.056) 


0.032 


(0.032; 0.033) 


(0.004; 0.099) 


CORFDR(510- 3 ) 


A-2 


0.033 


(0.028; 0.038) 


(0.008; 0.070) 


0.044 


(0.042; 0.045) 


(0.002; 0.121) 


CORFDRflO" 4 ) 


LOO 


0.010 


(0.008; 0.013) 


(0.000; 0.044) 


0.013 


(0.013; 0.014) 


(0.000; 0.045) 


CORFDR(10 -4 ) 


A10 


0.010 


(0.009; 0.012) 


(0.000; 0.053) 


0.014 


(0.014; 0.015) 


(0.000; 0.055) 


CORFDRflO" 4 ) 


A4 


0.009 


(0.007; 0.012) 


(0.001; 0.049) 


0.014 


(0.014; 0.014) 


(0.001; 0.054) 


CORFDRflO -4 ) 


A2 


0.009 


(0.007; 0.013) 


(0.001; 0.031) 


0.014 


(0.013; 0.015) 


(0.001; 0.040) 



Indicators Shim an d Si. 



, 95% 



HIM °HIM 

values of data subsampling. 
doi:1 0.1 371 /journal.pone.008981 



Student bootstrap confidence intervals and range for different instances of the MIC, WGCNA and CORFDR(^) networks for different 
5.W03 



that usually affects co-expression networks, we then apply the Data 
Processing Inequality (DPI) procedure for removing the weakest 
edge of each triplet, thus pruning the majority of undirected links. 

A unique interface to all the mentioned algorithms is integrated 
in the stability analysis tools in the nettooh package, based on their 
Bioconductor and CRAN implementations: minet for ARACNE 
and CLR, WGCNA for WGCNA, TOM and bicor, and minerva for 
MIC. 

Results and Discussion 

Stability is modularity invariant 

We demonstrate the invariance of the NetSI family with respect 
to network modularity in a controlled situation. We show that the 
proposed stability evaluation framework is not affected by various 
network structures for nine reconstruction algorithms. Moreover, 
we demonstrate that this property is maintained both if we adopt 
the HIM metric for the Ss indicator computation and we use the 
two components H and IM separately. 

Data generation. We created a set of networks G m with 50 
nodes each with m (where m ranges from 1 to 1 0) fully connected 
subgroups, which are linked to each other with a single edge. For 
m=l we obtain a fully connected network (without loops), while 
the resulting networks for m> 1 are displayed in Fig. 5. For each 
network G,„ we report its modularity value and density in Tab. 1 . 

The simulated gene expression values corresponding to the 
networks G m are generated loading the corresponding adjacency 
matrices in the Gene Net Weaver (GNW) simulator [50]. 
Specifically, the tool is used to create of simulated transcription 
datasets after a random initialization of each network's regulatory 



dynamics through a pre-loaded kinetic model [23]. Moreover it is 
possible to generate a steady-state dataset or a set of time series, 
which describes the network response to a perturbation, followed 
by perturbation removal until the steady state is reached. Thus, we 
chose to generate in one shot 50 time-series (one for each sample) 
with default parameter settings and to consider only the initial time 
point, since t = Q corresponds to the wild-type steady state. 
Summarizing, we generated 10 synthetic datasets having a 
simulated expression level for 50 "genes" and 50 "samples". 

Results. We inferred networks from the 10 datasets with nine 
algorithms: ARACNE, CLR, cor, TOM, bicor, WGCNA and 
MIC, where the last two were also used with a permutation-based 
FDR filter (for details, see Subsection "FDR control effect on 
correlation networks"). 

The stability analysis with three possible network metrics (HIM, 
H and IM) on networks inferred with the nine mentioned 
approaches is reported in Fig. 6. In all cases, the stability Sg varies 
less than 0.06 across different modularity values, as detailed in 
Tab. 2. Hence, the stability indicator is not affected by different 
modular structures. However, reconstruction accuracy depends on 
modularity (or density), as shown by a comparison with the gold 
standard (Fig. 7), in which a lower distance from the gold standard 
is found for sparser networks for all methods. 

Inference on synthetic yeast-like networks 

We investigated the behavior of the NetSI stability indicators for 
different sample sizes on a yeast-like dataset, again simulated by 
GNW. 

Data Description. We considered a subnetwork of the Yeast 
transcriptional regulatory network available in GNW, namely the 
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Figure 14. Synthetic dataset / : representation of .Shim and S' um stability indicators (with confidence intervals) for different 
instances of the FDR-corrected correlation networks, CORFDR(10 2 ), CORFDR(5 10 3 ), and CORFDR(K) '), WGCNA and MIC 
networks on the dataset T and for different values of data subsampling. 

doi:10.1371/journal.pone.0089815.g014 



InSilkoSizelOO-Yeast2 dataset with 100 nodes, originally a 
DREAM3 benchmark, generating 100 samples with default 
parameter configuration, including noise level, for wild-type 
steady state (the synthetic dataset Y). 

Results. We randomly extracted 10 subsets of different 
sample size N in {10,20,30,40,50,60,70,80,90,100}, replicating 
the subset extraction procedure 50 times for each A^. For each 
combination of resampling, we inferred the network with the 
same nine algorithms used in the previous experiment. 

As a general trend, stability decreases for larger sample size (see 
Fig. 8). The Shim stability curves for the two popular methods 
ARACNE and CLR drop quickly after 20% of the sample size, 
improving over Pearson and bicor. TOM and WGCNA are more 
stable but require at least 50% of the data. The standard MIC- 
based method with the default parameter (a = 0.6) is much 
smoothed by the FDR correction. Overall, the FDR corrected 
methods are the most stable even for small samples. TOM and 



WGCNA have the best internal stability S Hm (Fig. 9), followed by 
the FDR-corrected methods. 

Given that a gold standard is available for the simulated data in 
Y, we can compare the stability performance with reconstruction 
accuracy in terms of HIM and its components for the nine 
methods (Fig. 10). On this dataset, accuracy is independent of 
sample size, except for WGCNA and TOM which have an 
optimal range (Af = {30,40}), given by their soft thresholding 
procedures. Note that the source Yeast subnetwork is unweighted 
while all methods return a weighted network: a 10~ 3 threshold 
was thus applied to binarize the reconstructed network before 
computing the distances. Hence, MIC, cor and bicor perform 
badly as they lack an internal thresholding procedure; the FDR 
corrected methods have better but still mediocre results, slightly 
improved by CLR. On this dataset, WGCNA-FDR yields sparse 
networks (less than 20 edges) with small Hamming distances from 
the gold standard as they both have low density; however they 
have strongly different spectral structure from the gold standard, 
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Table 4. Synthetic dataset T: top ranked links for edge 
weight stability S w on networks inferred by WGCNA, 
CORFDR(p = l(T 4 ), and MIC. 





WGCNA 




CORFDRU0 


) 


MIC 




fi-fj 


Sy, 


ft -ft 


s w 


fi-fj 




1 - 3 


0.03 


1 - 3 


0.03 


3-4 


0.20 


2-3 


0.04 


3-4 


0.04 


2-3 


0.20 


1 - 2 


0.04 


2-3 


0.04 


1 - 3 


0.21 


1 - 4 


0.04 


1 - 4 


0.05 


3-5 


0.22 


3-4 


0.04 


3-5 


0.05 


1 - 2 


0.23 


2-4 


0.04 


1 - 2 


0.05 


1 - 5 


0.25 


4-5 


0.04 


2-4 


0.05 


1 - 4 


0.26 


2-5 


0.05 


2-5 


0.06 


4-5 


0.27 


1 - 5 


0.05 


4-5 


0.06 


7-10 


0.28 


3-5 


0.05 


1 - 5 


0.06 


7-8 


0.29 


6-8 


0.08 


6-8 


0.08 


6-8 


0.29 


8-10 


0.10 


7-8 


0.09 


6-10 


0.30 


7-8 


0.11 


8-10 


0.10 


1 - 20 


0.31 


7-9 


0.11 


8-9 


0.11 


2-4 


0.31 


8-9 


0.11 


6-7 


0.11 


8-10 


0.31 


9-10 


0.11 


7-10 


0.12 


2-5 


0.32 


6-7 


0.11 


7-9 


0.12 


9-10 


0.32 


7-10 


0.12 


9-10 


0.13 


7-20 


0.33 


6-10 


0.13 


6-9 


0.13 


14 - 16 


0.33 


6-9 


0.14 


6-10 


0.15 


5-17 


0.35 


11-13 


0.33 






6-7 


0.35 


14 - 15 


0.41 






11-17 


0.36 


13 - 14 


0.46 






6-9 


0.36 


12 - 13 


0.58 






1 - 10 


0.37 


12 - 15 


0.60 






10-11 


0.37 


11-14 


0.62 






10-20 


0.37 


13 - 15 


0.71 






4-17 


0.37 


11-15 


0.78 






2-8 


0.37 


14 - 18 


0.78 






4-10 


0.37 


3-11 


0.83 






6-13 


0.37 


5-11 


0.83 






2-14 


0.37 


1-11 


0.84 






9-11 


0.38 


4-11 


0.85 






15 - 16 


0.38 


3-10 


0.87 






15 - 17 


0.38 


5-16 


0.89 






7-13 


0.39 


8-17 


0.89 






9-18 


0.39 


2-11 


0.91 






12 - 19 


0.39 


8-12 


0.91 






6-18 


0.39 


4-13 


0.91 






8-9 


0.39 


1 - 13 


0.93 






4-18 


0.39 


3-13 


0.93 






16 - 17 


0.39 


8-13 


0.94 






4-19 


0.39 


9-17 


0.94 






16 - 19 


0.39 


1 - 16 


0.95 






7-19 


0.40 


1 - 10 


0.95 






5-8 


0.40 


14 - 16 


0.97 






14 - 15 


0.40 



Table 4. Cont. 



WGCNA 




CORFDR(10 4 ) 


MIC 




fi-fi 


S„. 


fi-fj s w 


fi-fj 


S w 


11-12 


0.98 




4-11 


0.40 


12 - 16 


0.98 




7-9 


0.41 


2-13 


0.99 




13 - 19 


0.41 



The links are ordered by S w across all 20 resamplings of HO cross validation, for 
the three algorithms; the table includes the top 50 links for WGCNA and MIC, 
and all 20 links found by CORFDR(p = 10~ 4 ). 
doi:1 0.1 371 /journal.pone.008981 5.t004 



as captured by the IM component. Finally, ARAGNE achieves fan- 
stability here as well as the highest accuracy. 

The effect of FDR control on stability 

We aim to assess differences in the stability of correlation 
networks inferred by an ad-hoc set of synthetic signals similar to 
expression data whenever inference is computed with or without 
False Discovery Rate (FDR) control. 

FDR control for correlation networks. For an introduction 
to FDR methods, see for instance [51]. The procedure considered 
in this paper is explicidy described in Fig. 1 1 . The FDR control 
defines a rule for choosing which edges to trust, and thus to keep, 



Table 5. Synthetic dataset T: nodes ranked by stability 
degree Sj on networks inferred by WGCNA, 
CORFDR($.j = l(r 4 ), and MIC. 
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fi 


S d 


fi 


S d 
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3 
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1 
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1 
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9 


0.23 


20 
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2 
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3 


0.03 


10 


0.09 


5 
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0.04 


5 
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0.24 
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2 
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6 


0.24 
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17 
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20 
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0.11 
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0.09 
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0.11 


15 
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0.09 


13 


0.11 


12 


0.45 


10 


0.09 


11 


0.11 


14 


0.48 


4 


0.13 


16 


0.11 


18 


0.55 


15 


4.42 


12 


0.11 


16 


0.60 


14 


7.05 


7 


0.11 


17 


0.68 


12 


22.82 


6 


0.12 


20 


0.70 


13 


26.05 


14 


0.13 


19 


1.15 


11 


41.83 


18 


0.13 



The 20 nodes are ordered by S t / across all 20 resamplings by HO cross 
validation. (*) indicates that range and mean are both zero. 
doi:1 0.1 371 /journal.pone.008981 5.t005 
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Figure 15. HCC-B dataset: CLR networks in the hairball representation inferred from the 4 subsets (A) Male Tumoral (MT), (B) Male 
non Tumoral (MnT), (C) Female Tumoral (FT), and (D) Female non Tumoral (FnT). Links are thresholded at weight 0.1, node position is fixed 
across the four networks, node dimension is proportional to the degree and edge width is proportional to link weight. 
doi:10.1371/journal.pone.0089815.g015 



during the network reconstruction phase. An edge weight is given 
by the correlation coefficient between the signal values of two 
nodes h,k: a^ = C{h,k) = \cor(xf l ,x\ c )\, where cor is a correlation 
function. If m is the sample size, we wish to estimate the chance 
that a random permutation of the expression values may give a 
correlation value higher than C(h,k), thus removing the edge 
when this chance is larger than a permutational p- value p = 10 -z , 
where z is a chosen level of significance (typically z = 3). In 
practice, this test is implemented by counting how many times 
C(h,k) is smaller than the correlation between ff(x/,) and t{xic), 
where a and t are distinct permutations of m objects. We consider 
here the absolute Pearson correlation at different p levels 



CORFDRO), when compared with WGCNA [30,43] with 
default thresholding parameter, as well as with the Maximal 
Information Coefficient (MIC), a non-linear correlation measure 
defined within the Maximal Information-based Nonparametric 
Exploration (MINE) statistics [45-47]. Note that the FDR 
correction procedure can be implemented with different correla- 
tion measures, such as WGCNA-FDR and MIC-FDR considered 
in the previous section. 

Data generation. In this example, the correlation networks 
are inferred from a dataset T of 100 samples of 20 features 
{/i> - ■ ■ i/2o}i via the Choleski decomposition (using the chol R 
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Figure 16. HCC-B dataset: CLR networks in the hiveplot representation inferred from the 4 subsets (A) Male Tumoral (MT), (B) Male 
non Tumoral (MnT), (C) Female Tumoral (FT), and (D) Female non Tumoral (FnT). Each plot consists of six axes with lines connecting points 
lying on the axes themselves. The axis ai pointing upwards collects all the nodes with (unweighted) degree 0 or 1 ; 012, the next axis moving clockwise, 
is a copy of i\; the following two axes include all nodes with degree 2, while on the remaining two axes lie all nodes with degree 3 or more. Different 
colors indicate different degree. Nodes on axes are ranked by degree. Lines between two consecutive axes show the network's edges and edge color 
is inherited by the node with smaller degree. Note the absence of links between nodes of degreee 1 and 2 in the FT case, and the smaller amount of 
connections between higher degree nodes in the MnT case with respect to the other three cases. 
doi:10.1371/journal.pone.0089815.g016 
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Figure 17. HCC-B dataset: mutual HIM distances for CLR inferred networks. Comparison of the four networks Male Tumoral (MT), Male non 
Tumoral (MnT), Female Tumoral (FT) and Female non Tumoral (FnT) reconstructed from the whole corresponding subsets in Tab. (A) and in the 
derived 2D multidimensional scaling plot (B). 
doi:1 0.1 371/journal.pone.008981 5.g01 7 
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Table 6. HCC-B dataset: S H im anc ' ^him statistics for FnT network. 



Algorithm 


k 


Shim 


Shim CI 
(min; max) 


Shim Range 
(min; max) 


9' 

J HIM 


Shim CI 
(min; max) 


Shim Ran 9 e 
(min; max) 


ARACNE 


LOO 


0.007 


(0.006; 0.009) 


(0.002; 0.018) 


0.008 


(0.008; 0.009) 


(0.001; 0.045) 


ARACNE 


k2 


0.037 


(0.030; 0.045) 


(0.007; 0.097) 


0.032 


(0.031; 0.033) 


(0.002; 0.179) 


ARACNE 


k4 


0.024 


(0.021; 0.027) 


(0.004; 0.060) 


0.022 


(0.022; 0.023) 


(0.002; 0.134) 


ARACNE 


klO 


0.013 


(0.012; 0.014) 


(0.003; 0.036) 


0.014 


(0.013; 0.014) 


(0.000; 0.081) 


CLR 


LOO 


0.022 


(0.017; 0.027) 


(0.003; 0.048) 


0.030 


(0.028; 0.032) 


(0.001; 0.094) 


CLR 


kl 


0.094 


(0.071; 0.117) 


(0.006; 0.257) 


0.1 19 


(0.113; 0.124) 


(0.006; 0.391) 


CLR 


k\ 


0.062 


(0.054; 0.072) 


(0.005; 0.203) 


0.080 


(0.078; 0.082) 


(0.003; 0.307) 


CLR 


k 1 0 


0.032 


(0.029; 0.035) 


(0.002; 0.093) 


0.045 


(0.044; 0.045) 


(0.000; 0.179) 


cor 


LOO 


0.021 


(0.017; 0.026) 


(0.006; 0.051) 


0.031 


(0.030; 0.032) 


(0.002; 0.122) 


cor 


k2 


0.1 17 


(0.104; 0.133) 


(0.059; 0.221) 


0.145 


(0.143; 0.147} 


(0.016; 0.431) 


cor 


k4 


0.070 


(0.065; 0.078) 


(0.023; 0.172) 


0.093 


(0.092; 0.094) 


(0.008; 0.347) 


cor 


kl 0 


0.038 


(0.036; 0.041) 


(0.014; 0.120) 


0.053 


(0.053; 0.054) 


(0.000; 0.279) 


bicor 


LOO 


0.026 


(0.022; 0.031) 


(0.012; 0.057) 


0.036 


(0.035; 0.037) 


(0.004; 0.136) 


bicor 


k2 


0.117 


(0.105; 0.132) 


(0.072; 0.227) 


0.151 


(0.148; 0.153) 


(0.015; 0.445) 


bicor 


k4 


0.076 


(0.070; 0.083) 


(0.033; 0.180) 


0.100 


(0.099; 0.101) 


(0.007; 0.367) 


bicor 


kl 0 


0.044 


(0.041; 0.047) 


(0.019; 0.126) 


0.060 


(0.059; 0.060) 


(0.000; 0.286) 


WGCNA 


LOO 


0.015 


(0.013; 0.018) 


(0.005; 0.033) 


0.019 


(0.019; 0.020) 


(0.003; 0.078) 


WGCNA 


k2 


0.099 


(0.086; 0.1 14) 


(0.035; 0.198) 


0.094 


(0.092; 0.096) 


(0.017; 0.341) 


WGCNA 


k4 


0.048 


(0.043; 0.055) 


(0.015; 0.1 15} 


0.057 


(0.056; 0.057) 


(0.007; 0.251) 


WGCNA 


kl 0 


0.026 


(0.025; 0.028) 


(0.006; 0.081) 


0.033 


(0.033; 0.034) 


(0.000; 0.187) 


MINE 


LOO 


0 035 


in rni • n n^Qi 


in m R- n n^Ai 

IU.U I O, U.UJnJ 


0 040 


in nAn- n nil I 

^U.UtU, U.UH 1 / 


In nn3- n r\Qf,\ 


MINE 


k2 


0.138 


(0.125; 0.157) 


(0.084; 0.277) 


0.149 


(0.146; 0.151) 


(0.020; 0.482) 


MINE 


k4 


0.150 


(0.138; 0.161) 


(0.050; 0.259) 


0.105 


(0.105; 0.106) 


(0.007; 0.335) 


MINE 


klO 


0.067 


(0.064; 0.071) 


(0.029; 0.138) 


0.067 


(0.066; 0.067) 


(0.000; 0.253) 


TOM 


LOO 


0.020 


(0.017; 0.025) 


(0.007; 0.047) 


0.025 


(0.024; 0.026) 


(0.003; 0.108) 


TOM 


k2 


0.133 


(0.115; 0.154) 


(0.046; 0.271) 


0.117 


(0.114; 0.119) 


(0.011; 0.467) 


TOM 


k4 


0.065 


(0.057; 0.075) 


(0.017; 0.161) 


0.073 


(0.072; 0.074) 


(0.010; 0.348) 


TOM 


klO 


0.035 


(0.033; 0.038) 


(0.007; 0.116) 


0.043 


(0.043; 0.044) 


(0.000; 0.268) 



Values of the indicators Shim and S^ IM together with bootstrap confidence intervals and range for the inferred networks for different values of data subsampling. 
doi:1 0.1 371 /journal.pone.008981 5.t006 



weak links, while CORFDR correctly selects only links within the 
two disjoint sets of nodes {f, ■: 1<;<5} and {/]■ : 6<i< 10}, 
corresponding to the strongest correlations in the matrix Mr- 

Note that although both WGCNA and CORFDR(l(r 4 ) 
employ cor internally, the two algorithms can lead to different 
results. Indeed the soft thresholding procedure in WGCNA [43] 
defines weights as a,j = \cor(Xi,Xj)f for /? = 6, while for 
CORFDR(l(r 4 ) is aij = \cor(Xi,Xj)\ if \F$.{ij)\<\, according to 
the definition in Fig. 1 1 . 

Stability estimates are also less variable with the FDR corrected 
methods. We compared k-{o\A cross-validation estimates of Shim 
and Shim ^ or ^ ve networks, with A; =1,2,4,10. Results are 
presented in Tab. 3 and displayed in Fig. 14. On this dataset, 
Shim ranges over [0.001,0.108] for WGCNA, [0.004,0.158] for 
MIC, and only over [0.000,0.053] for CORFDR(10~ 4 ). Note that 
estimates are both smaller and less variable for the smallest p- 
value, as noise effects are filtered. Hence, the use of a FDR control 
procedure helps stabilize the correlation based inference proce- 
dure, improving the performance of WGCNA, already one of the 
more robust options for real data [52]. 



function) of its correlation matrix Mj, randomly generated 
according to the following three constraints: 



cortfij})'. 



/0.9 for \<ij<5 

0.7 for 6 <ij<\0 
\0.4 for 11 <y<15 , 



where cor is the Pearson correlation. The correlation matrix 
Mj is plotted in Fig. 12: the correlation values in the three groups 
defined by the above constraints represent true relations between 
the variables, while all other smaller correlation values are due to 
the underlying random generation model for Mj. 

Results. Starting from the dataset T, we built five correlation 
networks, using MIC, WGCNA, and CORFDR(p) with p-values 
p = 10- 2 ,5-10-\l0- 4 . 

The three networks displayed in Fig. 13(A-C) were inferred with 
WGCNA, CORFDR(10" 4 ) and MIC respectively. WGCNA and 
MIC generate two fully connected networks with a majority of 
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Table 7. HCC-B dataset: Shim and S{ ilM statistics for FT network. 



■Shim CI Shim Range Shim CI Shim Range 

Algorithm k S H im (min; max) (min; max) .S'fn M (min; max) (min; max) 



ARACNE 


LOO 


0.007 


(0.006; 0.010) 


(0.001; 0.024) 


0.009 


(0.009; 0.010) 


(0.001; 0.053) 


ARACNE 


k2 


0.032 


(0.024; 0.053) 


(0.008; 0.221) 


0.036 


(0.034; 0.039) 


(0.003; 0.405) 


ARACNE 


k4 


0.016 


(0.014; 0.018) 


(0.005; 0.054) 


0.019 


(0.019; 0.020) 


(0.002; 0.137) 


ARACNE 


k10 


0.012 


(0.010; 0.013) 


(0.002; 0.061) 


0.014 


(0.014; 0.014) 


(0.000; 0.120) 


CLR 


LOO 


0.022 


(0.016; 0.032) 


(0.002; 0.093) 


0.032 


(0.030; 0.035) 


(0.001; 0.143) 


CLR 


k2 


0.069 


(0.056; 0.082) 


(0.006; 0.154) 


0.089 


(0.084; 0.093) 


(0.005; 0.250) 


CLR 


k4 


0.057 


(0.049; 0.066) 


(0.004; 0.190) 


0.078 


(0.076; 0.080) 


(0.003; 0.305) 


CLR 


klO 


0.040 


(0.037; 0.044) 


(0.002; 0.177) 


0.054 


(0.054; 0.055) 


(0.000; 0.143) 


cor 


LOO 


0.019 


(0.016; 0.024) 


(0.008; 0.044) 


0.028 


(0.028; 0.029) 


(0.003; 0.105) 


cor 


k2 


0.122 


(0.111; 0.138) 


(0.079; 0.221) 


0.144 


(0.142; 0.147) 


(0.014; 0.346) 


cor 


k4 


0.067 


(0.063; 0.073) 


(0.036; 0.141) 


0.088 


(0.087; 0.088) 


(0.007; 0.274) 


cor 


klO 


0.037 


(0.036; 0.039) 


(0.019; 0.083) 


0.051 


(0.051; 0.051) 


(0.000; 0.196) 


bicor 


LOO 


0.022 


(0.019; 0.027) 


(0.012; 0.049) 


0.031 


(0.031; 0.032) 


(0.004; 0.108) 


bicor 


k2 


0.116 


(0.107; 0.130) 


(0.073; 0.217) 


0.148 


(0.146; 0.150) 


(0.012; 0.364) 


bicor 


k4 


0.069 


(0.065; 0.074) 


(0.038; 0.137) 


0.092 


(0.091; 0.092) 


(0.009; 0.282) 


bicor 


k10 


0.040 


(0.039; 0.042) 


(0.022; 0.085) 


0.055 


(0.055; 0.055) 


(0.000; 0.202) 


WGCNA 


LOO 


0.010 


(0.008; 0.013) 


(0.003; 0.025) 


0.013 


(0.013; 0.014) 


(0.002; 0.070) 


WGCNA 


k2 


0.109 


(0.095; 0.124) 


(0.026; 0.194) 


0.070 


(0.068; 0.072) 


(0.016; 0.269) 


WGCNA 


k4 


0.043 


(0.039; 0.049) 


(0.008; 0.099) 


0.039 


(0.038; 0.039) 


(0.007; 0.182) 


WGCNA 


k10 


0.022 


(0.020; 0.023) 


(0.005; 0.066) 


0.024 


(0.024; 0.024) 


(0.000; 0.141) 


MINE 


LOO 


0.031 


(0.028; 0.034) 


(0.016; 0.051) 


0.032 


(0.031; 0.032) 


(0.003; 0.095) 


MINE 


k2 


0.138 


(0.128; 0.149) 


(0.094; 0.216) 


0.129 


(0.128; 0.131) 


(0.007; 0.321) 


MINE 


k4 


0.201 


(0.194; 0.207) 


(0.132; 0.257) 


0.088 


(0.088; 0.089) 


(0.005; 0.243) 


MINE 


klO 


0.077 


(0.075; 0.080) 


(0.040; 0.138) 


0.054 


(0.054; 0.054) 


(0.000; 0.193) 


TOM 


LOO 


0.015 


(0.011; 0.019) 


(0.004; 0.038) 


0.018 


(0.017; 0.019) 


(0.002; 0.105) 


TOM 


k2 


0.148 


(0.128; 0.170) 


(0.023; 0.269) 


0.092 


(0.090; 0.095) 


(0.012; 0.386) 


TOM 


k4 


0.060 


(0.053; 0.069) 


(0.011; 0.143) 


0.053 


(0.052; 0.054) 


(0.006; 0.272) 


TOM 


klO 


0.031 


(0.029; 0.033) 


(0.006; 0.097) 


0.033 


(0.033; 0.033) 


(0.000; 0.212) 


Values of the indicators Shim anc ' ^him together 
doi:1 0.1 371 /journal.pone.008981 5.t007 


with bootstrap confidence intervals and range for the inferred networks for different values of data subsampling. 



Results for weight stability and degree stability are listed in Tab. 
4 and Tab. 5, respectively, comparing the rankings for WGCNA, 
CORFDR(lCT 4 ) and MIC for a HO resampling. The stability 
indicators give information consistent with the structure of the 
starting correlation matrix Mj\ inference by WGCNA (Fig. 13 A) 
exactly reconstructs the block structure with three subnetworks 
(blue: Corr~ 0.9, green Corr~ 0.7, orange: Corr~ 0.4), while the 
FDR corrected version (Fig. 13 B) selects only two subnetworks 
corresponding to the two top correlated feature blocks of Fig. 12. 
In the WGCNA case, the top 20 most stable links (Tab. 4) are 
those of the two cliques F\,5 = {fi'- l<i<5} and 
2*6,10 = {fi '■ 6<,i<, 10} with largest correlation values in Mj. 
The correct block structure is found by CORFDR(10~ 4 ) with 
approximately the same values of S w found by the WGCNA 
network, where minor differences are due to the different 
thresholding procedures. 

The ^11,15 = {ft : 11</<15} variables (mutual correlation of 
about 0.3 imposed by design of Mj) are also mosdy top ranked 
links for WGCNA, but with larger instability values (0.33-0.78 vs. 
0.03-0.14). The remaining links are the least stable, with S n . values 
always larger than 0.83: they are the randomly correlated links of 



Mj. Similar but not identical results are found for the network M, 
as expected given that the MIC statistic aims at detecting generic 
associations between variables and it is expected to have reduced 
statistical power with low sample sizes. The structure of the 
network (Fig. 13 C) does not reflect the design linear correlation 
structure. Indeed, several links are ranked differently as the 
expected: although many links in the F\$ and 2*6,10 groups are 
highly ranked, some of them can also be found in much lower 
positions {e.g. 6-7, 6-9, or 7-9 are ranked lower than 20; 7-10, 7-8 
are ranked higher than 2-A, 2-5; 1-20, 7-20, 14-16, and 5-17 are 
ranked within the top 20 links). 

Similar considerations hold for the ranking of the most stable 
nodes: for WGCNA, the top-ranked nodes are the F\ s and the 
2*6,10 (with similar Sj values); those in 2*11,15 come next, leaving the 
remaining five as the least stable with higher Sd values. These 
nodes are trivially the most stable for CORFDR(10~ 4 ) as they are 
never wired to any other node in any of the resampling and thus 
their Sj values are void. 

The nodes 2*1,5 U2*6,l0 then follow in the ranking with small 
associated values, and the nodes F\\ t i$ close the standing with 
definitely higher values. In fact, although the nodes 2*ii 15 have 
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Table 8. HCC-B dataset: S H m and S^im statistics for MnT network. 



■Shim CI Shim Range -Shim CI Shim Range 

Algorithm k Shim (min; max) (min; max) S H im (min; max) (min; max) 



ARACNE 


LOO 


0.002 


(0.002; 0.002) 


(0.001; 0.004) 


0.002 


(0.002; 0.002) 


(0.000; 0.011) 


ARACNE 


k2 


0.014 


(0.011; 0.016) 


(0.003; 0.033) 


0.012 


(0.012; 0.012) 


(0.002; 0.056) 


ARACNE 


k4 


0.007 


(0.006; 0.008) 


(0.003; 0.022) 


0.008 


(0.007; 0.008) 


(0.001; 0.046) 


ARACNE 


klO 


0.005 


(0.004; 0.005) 


(0.002; 0.011) 


0.005 


(0.005; 0.005) 


(0.001; 0.027) 


CLR 


LOO 


0.002 


(0.002; 0.002) 


(0.000; 0.009) 


0.003 


(0.003; 0.003) 


(0.000; 0.016) 


CLR 


k2 


0.033 


(0.026; 0.041) 


(0.003; 0.104) 


0.037 


(0.035; 0.039) 


(0.002; 0.158) 


CLR 


k4 


0.018 


(0.015; 0.022) 


(0.001; 0.067) 


0.025 


(0.024; 0.026) 


(0.001; 0.102) 


CLR 


klO 


0.009 


(0.008; 0.010) 


(0.001; 0.034) 


0.013 


(0.013; 0.013) 


(0.001; 0.061) 


cor 


LOO 


0.003 


(0.003; 0.004) 


(0.001; 0.020) 


0.005 


(0.005; 0.005) 


(0.000; 0.047) 


cor 


k2 


0.050 


(0.044; 0.057) 


(0.028; 0.094) 


0.065 


(0.064; 0.066) 


(0.007; 0.219) 


cor 


k4 


0.031 


(0.028; 0.034) 


(0.013; 0.080) 


0.042 


(0.041 ; 0.042) 


(0.005; 0.180) 


cor 


klO 


0.018 


(0.017; 0.019) 


(0.007; 0.056) 


0.025 


(0.025; 0.025) 


(0.002; 0.134) 


bicor 


LOO 


0.004 


(0.004; 0.004) 


(0.001; 0.011) 


0.005 


(0.005; 0.005) 


(0.000; 0.027) 


bicor 


k2 


0.046 


(0.041; 0.054) 


(0.026; 0.103) 


0.063 


(0.062; 0.064) 


(0.009; 0.240) 


bicor 


k4 


0.031 


(0.028; 0.034) 


(0.014; 0.093) 


0.042 


(0.042; 0.043) 


(0.005; 0.197) 


bicor 


klO 


0.018 


(0.017; 0.019) 


(0.008; 0.050) 


0.025 


(0.024; 0.025) 


(0.002; 0.122) 


WGCNA 


LOO 


0.002 


(0.002; 0.003) 


(0.000; 0.019) 


0.003 


(0.003; 0.003) 


(0.000; 0.039) 


WGCNA 


k2 


0.037 


(0.032; 0.044) 


(0.009; 0.078) 


0.044 


(0.043; 0.045) 


(0.011; 0.181) 


WGCNA 


k4 


0.021 


(0.019; 0.024) 


(0.006; 0.059) 


0.026 


(0.026; 0.027) 


(0.005; 0.132) 


WGCNA 


klO 


0.013 


(0.012; 0.014) 


(0.003; 0.044) 


0.016 


(0.016; 0.016) 


(0.002; 0.100) 


MINE 


LOO 


0.005 


(0.005; 0.005) 


(0.002; 0.008) 


0.006 


(0.006; 0.006) 


(0.000; 0.019) 


MINE 


k2 


0.132 


(0.123; 0.143) 


(0.059; 0.199) 


0.064 


(0.063; 0.065) 


(0.006; 0.233) 


MINE 


k4 


0.052 


(0.048; 0.058) 


(0.021; 0.126) 


0.041 


(0.040; 0.041) 


(0.004; 0.186) 


MINE 


klO 


0.019 


(0.019; 0.021) 


(0.012; 0.055) 


0.025 


(0.025; 0.025) 


(0.002; 0.121) 


TOM 


LOO 


0.003 


(0.003; 0.004) 


(0.000; 0.023) 


0.004 


(0.004; 0.004) 


(0.000; 0.052) 


TOM 


k2 


0.051 


(0.043; 0.060) 


(0.011; 0.111) 


0.058 


(0.056; 0.059) 


(0.010; 0.253) 


TOM 


k4 


0.029 


(0.026; 0.033) 


(0.007; 0.085) 


0.035 


(0.034; 0.035) 


(0.005; 0.193) 


TOM 


klO 


0.017 


(0.016; 0.018) 


(0.003; 0.060) 


0.021 


(0.021; 0.021) 


(0.003; 0.138) 


Values of the indicators S H im arid S' Hm together 
doi:1 0.1 371 /journal.pone.008981 5.t008 


with bootstrap confidence intervals and range for the inferred networks for different values of data subsampling. 



degree zero in the network CORFDR(10~ 4 ) inferred from the 
whole T, some links connecting them have weight over the 
threshold in several resamplings. Note that the ranking values for 
MIC span a narrow range, with most of the nodes in F\ s in top 
positions, in general yielding a weak relation with the structure of 
M T . 

The weight and degree stability analysis for the other 
subsampling cases (LOO, k2 and k\0) are almost identical and 
thus not shown here. 

miRNA networks for hepatocellular carcinoma 

Investigating the relationships connecting human microRNA 
(miRNA) and how they evolve in cancer is a key issue for 
researchers in biology [53,54]. Hepatocellular carcinoma (HCC) is 
a notable example [55,56]: we test the NetSI indicators on a 
miRNA microarray hepatocellular carcinoma dataset with two 
phenotypes as a tool for differential network analysis. As CLR was 
used in the original paper, we applied this inference method and 
compared its stability with the reconstruction algorithms previ- 
ously employed on the synthetic datasets. 



Data description. The HCC dataset (HCC-B) [31,32] is 
publicly available at the Gene Expression Omnibus (GEO) http:// 
www.ncbi.nlm.nih.gov/geo with accession number GSE6857. The 
dataset collects 482 tissue samples from 241 patients affected by 
HCC. For each patient, a sample from cancerous hepatic tissue 
and a sample from surrounding non-cancerous hepatic tissue are 
available, hybridized on the Ohio State University CCC 
MicroRNA Microarray Version 2.0 platform consisting of 
11,520 probes measuring the expression of 250 non-redundant 
human and 200 mouse miRNAs. After imputation of missing 
values [57], probes corresponding to non-human (mouse and 
controls) miRNAs were discarded; samples for one patient (AN) 
were eliminated. We thus obtained a dataset of 240+240 paired 
samples described by 210 human miRNAs (210 males, 30 
females). Thus HCC-B can be split into four subsets by combining 
the gender and disease status phenotypes, respectively for tissues of 
male cancer patients (MT), female cancer patients (FT) and the 
corresponding non cancer tissues (MnT and FnT). 

For validation, we considered a second dataset (HCC-W) 
recently used to derive a signature of 30 miRNAs for hepatocel- 
lular carcinoma [58]. miRNA expression data for 166 subjects 



PLOS ONE | www.plosone.org 



19 



February 2014 | Volume 9 | Issue 2 | e89815 



Stability Indicators in Network Reconstruction 



Table 9. HCC-B dataset: Shim anc ' 'Shim statistics for MT network. 



■Shim CI Shim Range -Shim CI Shim Range 

Algorithm k S Hm (min; max) (min; max) Shim (min; max) (min; max) 



ARACNE 


LOO 


0.001 


(0.001; 0.002) 


(0.000; 0.008) 


0.002 


(0.002; 0.002) 


(0.000; 0.017) 


ARACNE 


k2 


0.017 


(0.014; 0.020) 


(0.004; 0.039) 


0.012 


(0.012; 0.013) 


(0.002; 0.054) 


ARACNE 


k4 


0.009 


(0.008; 0.010) 


(0.002; 0.021) 


0.008 


(0.008; 0.008) 


(0.001; 0.039) 


ARACNE 


klO 


0.005 


(0.005; 0.006) 


(0.001; 0.015) 


0.005 


(0.005; 0.005) 


(0.001; 0.033) 


CLR 


LOO 


0.002 


(0.002; 0.002) 


(0.000; 0.018) 


0.003 


(0.003; 0.003) 


(0.000; 0.030) 


CLR 


k2 


0.040 


(0.033; 0.051) 


(0.003; 0.146) 


0.051 


(0.048; 0.054) 


(0.003; 0.218) 


CLR 


k\ 


0.024 


(0.020; 0.029) 


(0.002; 0.099) 


0.033 


(0.032; 0.033) 


(0.001; 0.148) 


CLR 


/c 10 


0.011 


(0.010; 0.013) 


(0.001; 0.048) 


0.016 


(0.016; 0.016) 


(0.001; 0.092) 


cor 


LOO 


0.003 


(0.003; 0.004) 


(0.001; 0.013) 


0.005 


(0.005; 0.005) 


(0.000; 0.028) 


cor 


k2 


0.046 


(0.041; 0.054) 


(0.029; 0.105) 


0.063 


(0.062; 0.064) 


(0.008; 0.254) 


cor 


k4 


0.028 


(0.026; 0.030) 


(0.016; 0.052) 


0.038 


(0.037; 0.038) 


(0.005; 0.132) 


cor 


k10 


0.017 


(0.016; 0.018) 


(0.008; 0.041) 


0.023 


(0.023; 0.023) 


(0.002; 0.101) 


bicor 


LOO 


0.004 


(0.003; 0.004) 


(0.001; 0.013) 


0.005 


(0.005; 0.005) 


(0.000; 0.028) 


bicor 


k2 


0.049 


(0.044; 0.058) 


(0.027; 0.112) 


0.067 


(0.066; 0.068) 


(0.009; 0.272) 


bicor 


k4 


0.029 


(0.027; 0.031) 


(0.016; 0.061) 


0.039 


(0.039; 0.040) 


(0.004; 0.148) 


bicor 


klO 


0.018 


(0.017; 0.019) 


(0.009; 0.042) 


0.025 


(0.025; 0.025) 


(0.002; 0.105) 


WGCNA 


LOO 


0.002 


(0.002; 0.002) 


(0.000; 0.009) 


0.003 


(0.003; 0.003) 


(0.000; 0.020) 


WGCNA 


k2 


0.034 


(0.029; 0.040) 


(0.009; 0.075) 


0.038 


(0.037; 0.039) 


(0.008; 0.175) 


WGCNA 


k4 


0.018 


(0.016; 0.021) 


(0.006; 0.046) 


0.022 


(0.022; 0.023) 


(0.004; 0.122) 


WGCNA 


klO 


0.011 


(0.010; 0.012) 


(0.002; 0.028) 


0.013 


(0.013; 0.013) 


(0.002; 0.075) 


MINE 


LOO 


0.006 


(0.006; 0.006) 


(0.002; 0.010) 


0.006 


(0.006; 0.006) 


(0.000; 0.018) 


MINE 


k2 


0.173 


(0.163; 0.183) 


(0.111; 0.242) 


0.060 


(0.059; 0.061) 


(0.004; 0.220) 


MINE 


k4 


0.065 


(0.061; 0.070) 


(0.027; 0.1 1 7) 


0.037 


(0.036; 0.037) 


(0.003; 0.149) 


MINE 


k10 


0.020 


(0.019; 0.021) 


(0.011; 0.046) 


0.024 


(0.024; 0.024) 


(0.001; 0.099) 


TOM 


LOO 


0.003 


(0.003; 0.003) 


(0.000; 0.012) 


0.003 


(0.003; 0.004) 


(0.000; 0.028) 


TOM 


k2 


0.046 


(0.039; 0.054) 


(0.011; 0.102) 


0.049 


(0.047; 0.051) 


(0.008; 0.246) 


TOM 


k4 


0.025 


(0.022; 0.028) 


(0.007; 0.062) 


0.029 


(0.029; 0.030) 


(0.003; 0.168) 


TOM 


k10 


0.015 


(0.013; 0.016) 


(0.003; 0.038) 


0.017 


(0.017; 0.017) 


(0.002; 0.102) 


Values of the indicators S H im arid S' Hm together 
doi:1 0.1 371 /journal.pone.008981 5.t009 


with bootstrap confidence intervals and range for the inferred networks for different values of data subsampling. 



(paired samples for 141 males and 25 females), acquired with the 
CapitalBio custom two-channel microarray platform (692 probes), 
are available at GEO accession number GSE31384. Data are 
processed as normalized differential miRNA expression levels 
between tumors and non cancerous liver tissue data [58]. 

Results. Using the CLR algorithm we first generated the four 
networks inferred from the whole set of datasets and correspond- 
ing to the combinations of the two binary phenotypes, discarding 
links with weight smaller than 0.1. Two different representations of 
the resulting graphs are shown in Fig. 15 and Fig. 16, respectively 
in hairball and hiveplot layouts [59], The second visualization 
technique is particularly useful in highlighting differences between 
networks by disaggregating the network structure according to 
their node degree. The four networks have different structures: 
more high degree links are present for tumoral tissues (graphs for 
MT and FT: Fig. 16 A and C) than for controls (MnT, FnT). 
Their density values, defined as the ratio between the number of 
existing edges and the maximal number of edges for the given 
graph), are 0.0153 (MT), 0.0092 (MnT), 0.0206 (FT) and 0.0121 
(FnT). The mutual HIM distances for the four networks are 
reported in Fig. 17 A, together with the corresponding 



two-dimensional scaling plot (Fig. 17 B). The networks corre- 
sponding to the female patients (and, in particular, the FT inferred 
from cancer tissue) are different from those inferred for the male 
patients. 

In order to explore network reconstruction reliability, we then 
computed the NetSI indicators and compared the results for CLR 
with other six reconstruction algorithms ARACNE, cor, TOM, 
bicor, WGCNA and MIC. The corresponding statistics for the 
four subsets and different subsampling (LOO, kl, k4, and /clO) are 
listed in Tables 6-9 and summarized in Fig. 18. Both the 
resampling strategy and phenotypes have an impact on the 
network stability, differently for the seven methods: the networks 
corresponding to male patients have smaller values for Shim an< A 
5 HIM (and thus they are much more stable) than the correspond- 
ing female counterparts. The leave-one-out stability for FT and 
FnT is worse than for M0 and k4 stability on MT and MnT. 
However phenotypes have a stronger effect than the resampling 
strategy. Note that while control and cancer networks display 
similar stability for males at all levels of subsampling ratio, the FT 
network is more stable than the matching FnT control networks; 
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this is evident when the size of the subset used for inference gets 
smaller, in particular for k2. 

To validate the interest of the stability analysis in terms of 
biological relevance and reproducibility, we applied the S w weight 
stability indicator and ranked accordingly the links for networks 
inferred with CLR on the MT, MnT, FT, FnT subsets. Our 
hypothesis is that, in practical cases, stability may be either an 
effect of real biological relevance of the link or highlight trivial 
relationships between nodes. In the former situation, we expect 
that stable links have a better chance of being reproduced on new 
datasets. However, the comparison between miRNA studies is 
often haunted by the significant changes across releases of the 
annotation datasets. It is not uncommon that a miRNA may get 
discarded or unified with others in a newer release, or that miRNA 
symbols may just be remapped to different sequences. 

As an experiment, we considered the first six top ranked edges 
for the four networks, obtaining seven distinct pairs of miRNAs 
according to the original annotation (Tab. 10). As reported by the 
original GEO platform (accession number GPL4700), the HCC-B 
dataset [31] used the miRBase June 2004 or information collected 
from literature. The two miRNA hsa-mir-321Nol and hsa-mir- 
321No2 had the same mature Sanger registry name in the 
miRBase June 2004, and were eliminated and marked as tRNA in 
the June 20 1 3 version. Not surprisingly they define a link (link (a) 
in the table) which is trivially the most stable for all four networks, 
regardless of the phenotypes. Similarly the pairs of miRNA 
defining the links (b) and (c) can be found to align to the same 
sequences by considering miRBase June 2013. Note that (c) is 



associated with the mature miR21, which is a known oncomir 
[60]. The hsa-mir-219-1 in link (d) is one of the 20 miRNAs in the 
signature proposed by Budhu and colleagues [31]; in our table it is 
differently stable for gender, which is consistent with being 
associated with estrogen regulation and overexpressed in breast 
cancer [61,62]. The link (e) connects hsa-mir-326 with hsa-mir- 
342; the first node is known to be involved in brain tumorigenesis 
[63,64] and it is directly related to microRNA gene expression 
profile of hepatitis C virus-associated hepatocellular carcinoma 
[65]. The second node is associated with different cancers and 
recently proposed as a diagnostic biomarker and therapeutic target 
for HCC [66]. 

Both nodes defining link (f) are associated with cancer [67-69], 
namely with HRC in hepatitis infection and cirrhosis [53,70]. A 
search by sequence with the miRBase web-service [7 1] shows that 
the two probes hsa-mir-192.2.3Nol and hsa-mir-215.precNo have 
a high alignment score (e value: 0.02 [72]). In our analysis, the link 
is very stable in tumours and definitely unstable on controls. 

Link (g) connects hsa-mir-092.prec. 13.092. 1No2 and hsa-mir- 
092.prec.X.092.2: it is the most stable for FT. The two miRNAs 
are to known to be associated with chronic lymphocitic leukemia 
[73], found respectively in genomics regions related to follicular 
lymphoma and advanced ovarian cancer. 

Finally, we compared a disease network inferred from HCC-B 
[31] with the one from HCC-W using the stability indicators, also 
considering the 30 miRNAs signature for hepatocellular carcino- 
ma derived from this dataset [58]. As a preliminary step, data from 
HCC-B were also normalized as ratios (tumor/non-tumor), then 



Groups: FT T FnT A MT ▼ MnT A 

Resamplings: LOO • k2 • k4 • k10 • 
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Figure 18. HCC-B dataset: Shim and S H[M stability indicators of the four subgroups MT, MnT, FT, and FnT. The networks are inferred 
with six different algorithms for different values of data subsampling. MT: Male Tumoral. MnT: Male non Tumoral. FT: Female Tumoral. FnT: Female 
non Tumoral. Confidence intervals are represented for each experiment. Points of increasing dimension are used to represent the diverse resampling 
schema: Leave One Out, A'-fold cross validation for k set to 2 (kl), 4 (A-4) and 10 (A 10) respectively. 
doi:1 0.1 371/journal.pone.008981 5.g01 8 
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Table 10. HCC-B dataset: evaluation of top selected miRNA-miRNA CLR links. 



id 


Node 1 


Node 2 


MT 


MnT 


FT 


FnT 


(a) 


hsa-mir-321No1 


hsa-mir-321 No2 


1 


1 


9 


2 


(b) 


hsa-mir-016b.chr3 


hsa-mir-16.2No1 


3 


12 


15 


309 


(c) 


hsa-mir-021-prec-17No1 


hsa-mir-21 Nol 


27 


5 


2 


921 


(d) 


hsa-mir-219.1No1 


hsa-mir-321 No2 


2 


6 


1903 


314 


(e) 


hsa-mir-326No1 


hsa-mir-342No2 


132 


1017 


3 




(f) 


hsa-mir-192.2.3No1 


hsa-mir-21 5. precNol 


4 


300 


4 


3340 


(g) 


hsa-mir-092.prec.1 3.092.1 No2 


hsa-mir-092.prec.X.092.2 


79 


1158 


1 


404 



Position in the S w ranking for HO in the four cases MT, MnT, FT, and FnT. 
doi:1 0.1 371 /journal.pone.008981 5.t01 0 



we mapped the probes from the first platform into the second with 
the Bowtie aligner v. 0.12.7 [74] with default parameters. A subset 
of 120 matching miRNAs was found, which includes 12 miRNAs 
from Wei's signature. Separately for each dataset, we filtered for 
probes having a homogeneous fold change for at least 80% of the 
samples, obtaining 30 miRNAs for HCC-B and 37 for HCC-W, 
for a total of 54 distinct miRNAs (13 were common). We inferred 
by CLR two disease networks Ng and Nfy on these 54 miRNAs, 
and then computed the weight stability S w with k=l0 cross- 
validation for each network. 

Within the first 10 top ranked links for Nb, half (5) included 
nodes that also belonged to Wei's original signature. Moreover, 
the second most stable link for Ng was also present in the same 
position for the Nw disease network, which also included 4 
miRNAs from Wei's signature in its top 10 most stable edges. The 
common link is formed by the pair (hsa-mir-17,hsa-mir-20a), 
which are both strongly associated with cancer, including 
colorectal cancer [75,76]. It is worth noting that hsa-mir-17 is 
the most strongly associated miRNA to hsa-mir-20a according to 
PhenomiR 2.0 [77], being at distance <10 kb [71]. The same 
range includes hsa-mir92a-l, mapping to the hsa-mir-092.pre- 
c. 13.092. 1No2 probe mentioned above. A concordant overex- 
pression of both hsa-mir-20a and hsa-mir-17 was found for this 
link for both the two networks Nb and Njy. 

Conclusions 

We introduced the NetSI family of indicators for assessing the 
variability of network reconstruction algorithms as functions of a 
data subsampling procedure. Our aim here is to provide the 
researchers with an effective tool to compare either the inference 



algorithms or properties of the investigated dataset. The first two 
indicators are global, giving a confidence measure over the whole 
inferred dataset and are based on a measure of distance between 
networks. In particular, we demonstrated the proposed framework 
with the HIM distance, although it is independent from the chosen 
metric. The other two indicators are local, associating a reliability 
score to the network nodes and the detected links. They are of 
particular interest when coupled with algorithms of proven 
performance, being able to capture the effect of data perturbation 
on the reconstruction process. We demonstrated their consistency 
on two synthetic datasets, testing their dependency on different 
inference methods, and also in comparison with the gold 
standards. Finally we showed an application for disease networks 
on miRNA hepatocellular carcinoma data; we found biological or 
technical evidence for the most stable links in the networks, and 
good reproducibility on a second dataset having relevant 
differences in terms of microrray technology and annotation 
platform. 
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